Abstract

Appropriate use of exploratory factor analysis (EFA) necessitates
thoughtful researcher judgment concerning a number of analytic
decisions. The present paper reviewed some of the fundamental
decisions necessary to conduct an EFA and examined reporting
practice in published research across four journals. Largely,
insufficient information was given in published applications of
EFA (n = 60) to allow external verification of the results.
Researchers often utilized poor strategies to determine the number
of factors to retain. In one-third of the cases, a confirmatory
factor analysis was warranted over EFA. Other reporting and
practice errors are noted. Several recommendations for improved
EFA reporting practice are given.

A Meta-analytic Review of Exploratory Factor Analysis Reporting
Practices in Published Research

Researchers commonly attempt to explain the most with the 
least. For example, because all classical parametric analyses are 
part of a broader general linear model, these analyses are all 
correlational, yield r²-type effect sizes, and maximize the shared
variance between variables (e.g., regression) or between sets of 
variables (e.g., canonical correlation) (Bagozzi, Fornell & 
Larcker, 1981; Cohen, 1968; Henson, 2000; Knapp, 1978; Thompson, 
1991). In fact, classical parametric analyses (e.g., ANOVA, r, 
MANOVA, DDA) can all be performed with canonical correlation 
analysis, and thus are special cases of canonical analysis (Fan, 
1996, 1997; Thompson, 2000a). 

Because implicit within canonical correlation analysis itself 
is a principal components analysis (Thompson, 1984, pp. 11-14),
all classical parametric analyses also invoke principal components 
analyses. This truism suggests the importance of factor analysis 
within statistics. 

In the interest of parsimony, researchers often strive to 
explain the most shared variance of measured variables using the 
fewest possible latent or synthetic variables. Such parsimonious 
solutions are generally considered to have greater external 
validity and, as such, are more likely to replicate. Thus, 
Kerlinger (1979) argued that factor analysis is "one of the most 
powerful methods yet for reducing variable complexity to greater 
simplicity" (p. 180). 

Factor analysis is often used to explain a larger set of j
measured variables with a smaller set of k latent constructs. It
is hoped, generally, that the k constructs will explain a good
portion of the variance in the original j x j matrix of
associations (e.g., correlation matrix) so that the constructs, or
factors, can then be used to represent the observed variables.
These constructs can be used as variables in subsequent analyses
and "can be seen as actually causing the observed scores on the
measured variables" (Thompson & Daniel, 1996, p. 202). In short,
"factor analysis is intimately involved with questions of
validity ... [and] is at the heart of the measurement of
psychological constructs" (Nunnally, 1978, pp. 112-113).

Historically, the theoretical framework for factor analysis
is credited to Pearson (1901) and Spearman (1904), but practical
application of the method is a modern phenomenon. As Kieffer
(1999) noted:

Spearman, through his work on personality theory,
provided the conceptual and theoretical rationale
for both exploratory and confirmatory factor
analysis. Despite the fact that the conceptual
bases for these methods have been available for
many decades, it was not until the wide-spread
availability of both the computer and modern
statistical software that these analytic techniques
were employed with any regularity. (p. 75)

Thanks to the advent of technology, factor analysis is now
frequently employed in both measurement and substantive research. 

Given the proliferation of factor analysis applications in 
the literature, the purpose of the present paper was to examine
the utilization of factor analysis in current published research.
Notwithstanding ease of analysis due to computers, the appropriate
use of factor analysis requires a series of thoughtful researcher
judgments. These judgments directly impact results and
interpretations.

Specifically, we examined across studies (a) the decisions 
made while conducting exploratory factor analyses and (b) the 
information reported from the analyses. In doing so, we present 
here the current status of factor analytic reporting practices and 
make recommendations for future practice as regards analytic 
decisions and reporting in empirical research. 

Exploratory Factor Analysis 

Modern conceptualizations of factor analysis include both 
exploratory and confirmatory methods, as well as the hybrid 
invoking exploratory factor extraction followed by confirmatory 
rotation (Thompson, 1992). Exploratory factor analysis (EFA) is
used to "identify the factor structure or model for a set of
variables" (Bandalos, 1996, p. 389). As its name implies, EFA is
an exploratory method used to generate theory; researchers use EFA
to search for the smaller set of k latent factors to represent the
larger set of j variables. As Pedhazur and Schmelkin (1991) noted,
"of the various approaches to studying the internal structure of a
set of variables or indicators, probably the most useful is some
variant of factor analysis" (p. 66).

On the other hand, confirmatory factor analysis (CFA) is used 
to test theory when the analyst has sufficiently strong rationale 
regarding what factors should be in the data and what variables
should define each factor. A fundamental and critically important
difference between EFA and CFA is that results of an EFA are a
sole function of the "mechanics and mathematics of the method"
(Kieffer, 1999, p. 77). EFA does not consider a priori theory held
by the researcher (Daniel, 1989). CFA, on the other hand, is
driven by theoretical expectations regarding the structure of the
data.

As Gorsuch (1983) noted, "Whereas the former [EFA] simply
finds those factors that best reproduce the variables under the
maximum likelihood conditions, the latter [CFA] tests specific
hypotheses regarding the nature of the factors" (p. 129). The
reader is referred to Gorsuch (1983), Stevens (1996), and
Tabachnick and Fidell (1996) for extensive treatments of these
approaches. The present chapter is concerned with EFA.

Purposes of Factor Analysis 

Because the latent constructs, or factors, are thought to 
cause and summarize responses to observed variables, theory 
development and score validity evaluation are both closely related 
to factor analysis. As Hendrick and Hendrick (1986, p. 393)
emphasized, "theory building and construct measurement are joint
bootstrap operations." Factor analysis at once both tests 
measurement integrity and guides further theory refinement. 

As noted by Kieffer (1999), "[t]he utilization of factor
analytic techniques in the social sciences has been indelibly
intertwined [both] with developing theories and evaluating the
construct validity of measures [i.e., scores]" (p. 75). Regarding
construct validity, Gorsuch (1983) noted:

Research proceeds by utilizing operational
referents for the constructs of a theory to test if
the constructs interrelate as the theory states....
A prime use of factor analysis has been in the
development of both the operational constructs for
an area and the operational representatives for the
theoretical constructs. (p. 350)

Factor analysis can be used to determine what theoretical 
constructs underlie a given data set and the extent to which these 
constructs represent the original variables. Of course, the 
meaningfulness of latent factors is ultimately dependent on 
researcher definition. As Mulaik (1987) suggested, "It is we who 
create meanings for things in deciding how they are to be used. 
Thus we should see the folly of supposing that EFA will teach us 
what intelligence is, or what personality is" (p. 301). However,
Thompson and Daniel (1996) noted that

analytic results can inform the definitions we wish 
to create, even though we remain responsible for 
our elaborations and may even wish to retain the 
definitions that have not yet been empirically 
supported or that limited empirical evidence may 
even contradict. (p. 202)

(Thoughtful) Researcher Judgment in EFA

Despite its utility in both measurement and substantive
research contexts, factor analysis has been criticized. Cronkhite
and Liska (1980) observed:

Apparently, it is so easy to find semantic scales
which seem relevant..., so easy to name or describe 
potential/hypothetical sources, so easy to capture 
college students to use the scales to rate the 
sources, so easy to submit those ratings to factor 
analysis, so much fun to name the factors when 
one's research assistant returns with the computer 
printout, and so rewarding to have a guaranteed 
publication with no fear of nonsignificant [sic] 
results that researchers, once exposed to the 
pleasures of the factor analytic approach, rapidly 
become addicted to it. (p. 102) 

Much of the criticism centers on the inherent subjectivity of 
the decisions necessary to conduct an exploratory factor analysis. 
Tabachnick and Fidell (1996) stated that "[o]ne of the problems
with [principal components analysis] and [factor analysis] is that
there is no criterion variable against which to test the solution"
(p. 636). Interpretation of results largely hinges on (hopefully)
reflective researcher judgment. Tabachnick and Fidell also noted
that after factor extraction,

there is an infinite number of rotations available, 
all accounting for the same amount of variance in 
the original data, but with factors defined 
slightly differently. The final choice among 
alternatives depends on the researcher's assessment 
of its interpretability and scientific utility. In 
the presence of an infinite number of 
mathematically identical solutions, researchers are
bound to differ regarding which is best. Because
the differences cannot be resolved by appeal to
objective criteria, arguments over the best
solution sometimes become vociferous. (p. 636)

Because EFA "can be conceptualized as a series of steps which 
require that certain decisions be addressed at each individual 
stage... there are many different ways in which to conduct an EFA, 
and each different approach may render distinct results when 
certain conditions are satisfied" (Kieffer, 1999, pp. 76-77).
Therefore, appropriate use of EFA necessitates thoughtful and 
informed researcher decision making. 

EFA Decisions 

A complete review of the steps and possible decisions 
necessary to conduct an EFA is beyond the scope of this chapter. 
However, a brief review is given here to place the current study 
in context. A comprehensive treatment is provided by Gorsuch
(1983). Hetzel (1996) and Kieffer (1999) presented briefer
user-friendly primers on factor analysis.

Matrix of Associations 

Because all classical statistical analyses are fundamentally 
correlational (cf. Cohen, 1968; Knapp, 1978), all analyses focus 
on a matrix of associations that describes the relationships 
between the variables in question. To conduct an EFA, the 
researcher must decide which matrix of associations (e.g., 
correlation, variance/covariance) to analyze. Most statistical
packages use the correlation matrix (with 1.0 on the diagonal) as
the default option in EFA. Consequently, researchers tend to use
the correlation matrix.

Method of Factor Extraction 

There are multiple ways to extract factors. Principal 
components analysis (PCA) and principal axis factoring (PAF) tend
to be the most common. Factor extraction attempts to remove 
variance common to sets of variables from the original matrix of 
association. After the first factor (or common variance for a set 
of variables) has been extracted, a residual matrix remains. A 
second factor, which is orthogonal to the first, is then extracted 
from the residual matrix to explain as much of the remaining 
variance among the variables as possible. The process continues 
until noteworthy variance can no longer be explained by factors. 
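
As a minimal sketch of the sequential extraction just described,
the following Python code (using NumPy) extracts principal
components one at a time from a correlation matrix and computes the
residual matrix after each step. The correlation matrix `R` and the
function name `extract_components` are hypothetical and serve only
to illustrate the idea; statistical packages typically obtain all
components from a single eigendecomposition rather than by explicit
step-by-step removal.

```python
import numpy as np

def extract_components(R, k):
    """Extract k principal components from correlation matrix R, one at a time.

    Returns the (variables x k) matrix of unrotated coefficients and the
    residual matrix left after the k-th extraction.
    """
    residual = np.array(R, dtype=float)
    columns = []
    for _ in range(k):
        # eigh returns eigenvalues in ascending order; take the largest.
        eigvals, eigvecs = np.linalg.eigh(residual)
        lam, vec = eigvals[-1], eigvecs[:, -1]
        coef = vec * np.sqrt(lam)
        columns.append(coef)
        # Remove the variance reproduced by this component.
        residual = residual - np.outer(coef, coef)
    return np.column_stack(columns), residual

# Hypothetical correlation matrix for four measured variables.
R = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])
coefficients, residual = extract_components(R, k=2)
print(np.round(coefficients, 3))
print(np.round(residual, 3))
```

Because each component is taken from the residual of the previous
one, the extracted columns are mutually orthogonal, and the sum of
squared coefficients in each column equals the corresponding
eigenvalue of the original matrix.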

The application of PCA as against PAF has been hotly debated. 
As Thompson and Daniel (1996) noted:

Analysts differ quite heatedly over the utility of 
principal components as compared to common or 
principal factor analysis [i.e., PAF].... The 
differences between the two approaches involves the 
entries used on the diagonal of the matrix of 
associations that is analyzed. When a correlation 
matrix is analyzed, principal components analysis 
uses ones on the diagonal whereas common factor 
analysis uses estimates of reliability, usually 
estimated through an iterative process. (p. 201)

Gorsuch (1983) suggested that the researcher consider carefully
which method to use, because differences can be meaningful.
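
The difference in diagonal entries can be illustrated with a
simplified principal axis factoring loop in Python (NumPy). This is
a sketch under stated assumptions, not the algorithm of any specific
package: it assumes squared multiple correlations as the starting
diagonal estimates, and the function name `principal_axis`, the
fixed iteration count, and the example matrix are all hypothetical.

```python
import numpy as np

def principal_axis(R, k, n_iter=50):
    """Simplified iterative principal axis factoring of correlation matrix R.

    Returns unrotated coefficients (variables x k) and final communalities.
    """
    R = np.asarray(R, dtype=float)
    # Assumed starting values: squared multiple correlations.
    h2 = 1.0 - 1.0 / np.diag(np.linalg.inv(R))
    for _ in range(n_iter):
        reduced = R.copy()
        np.fill_diagonal(reduced, h2)  # PCA would leave ones here
        eigvals, eigvecs = np.linalg.eigh(reduced)
        top = np.argsort(eigvals)[::-1][:k]
        # Guard against small negative eigenvalues of the reduced matrix.
        coef = eigvecs[:, top] * np.sqrt(np.clip(eigvals[top], 0.0, None))
        h2 = np.sum(coef ** 2, axis=1)  # updated diagonal estimates
    return coef, h2

# Hypothetical two-factor correlation matrix for six variables.
pattern = np.zeros((6, 2))
pattern[:3, 0] = 0.8
pattern[3:, 1] = 0.7
R = pattern @ pattern.T
np.fill_diagonal(R, 1.0)

coef, communalities = principal_axis(R, k=2)
print(np.round(communalities, 3))
```

In this constructed example the diagonal estimates settle near 0.64
and 0.49, the values built into the hypothetical matrix, whereas a
principal components analysis would analyze ones on the diagonal.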

Thompson (1992) argued, however, that the practical 
difference between the methods is often negligible in terms of 
interpretation. Differences in results will decrease as (a) the 
measured variables have greater score reliability or (b) the 
number of variables measured increases. Regarding (a), the higher 
the score reliability for a variable, the closer the PAF entry on 
the diagonal is to one, which is what is used by PCA. 

Regarding (b), as the number of variables increases, so does
the total number of entries in the matrix of association. The
diagonal entries then have less influence on the solution,
because the proportion of entries on the diagonal shrinks in
inverse proportion to the number of variables measured (cf. Snook
& Gorsuch, 1989). For example, with 10 measured variables there
are 10 diagonal entries out of 100 total entries (i.e., 10.0%),
but with 30 measured variables there are 30 diagonal entries out
of 900 total entries (i.e., 3.3%), and with 50 measured variables
there are 50 diagonal entries out of 2500 total entries (i.e.,
2.0%).
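
The arithmetic behind these percentages can be checked with a few
lines of Python. The function name `diagonal_share` is hypothetical;
the calculation is simply the number of variables divided by the
number of entries in the square matrix of associations.

```python
def diagonal_share(n_variables):
    """Percentage of matrix entries that lie on the diagonal."""
    return 100.0 * n_variables / n_variables ** 2

for n in (10, 30, 50):
    print(f"{n} variables: {diagonal_share(n):.1f}% of entries on the diagonal")
```

The share is 100/n percent for n variables, so doubling the number
of variables halves the proportion of diagonal entries.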

Factor Retention Rules 

When variables are factored (see Campbell, 1996, and Thompson,
2000b, for a discussion of factoring people), the total number of
possible factors equals the number of variables factored. However,
because many of these factors may not contribute substantially to 
the overall solution or be interpretable, some factors are not 
useful to retain in the analysis and generally represent noise or 
error in the variables. The goal of EFA is to retain the fewest
possible factors while explaining the most variance of the 
observed variables. It is critical that the researcher extract the 
correct number of factors because this decision will impact 
results directly. 

Many rules can be used to determine the number of factors to 
retain (cf. Zwick & Velicer, 1986), including the eigenvalue 
greater than one rule (EV > 1; cf. Kaiser, 1960), scree test 
(Cattell, 1966), minimum average partial correlation (Velicer,
1976), Bartlett's chi-square test (Bartlett, 1950, 1951), and
parallel analysis (Horn, 1965; Turner, 1998). Thompson and Daniel
(1996) and Zwick and Velicer (1986) elaborated these approaches.
The most frequently used method is the EV > 1 rule. As Thompson
and Daniel noted, "This extraction rule is the default option in
most statistics packages and therefore may be the most widely used
decision rule, also by default" (p. 200).
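
To make the EV > 1 rule concrete, the following Python sketch
(NumPy) counts the eigenvalues of a correlation matrix that exceed
1.0. The matrix `R` and the function name `kaiser_count` are
hypothetical, and the sketch illustrates only the mechanics of the
rule, not an endorsement of it.

```python
import numpy as np

def kaiser_count(R):
    """Apply the EV > 1 rule to correlation matrix R.

    Returns the number of eigenvalues above 1.0 and all eigenvalues
    in descending order.
    """
    eigenvalues = np.linalg.eigvalsh(np.asarray(R, dtype=float))[::-1]
    return int(np.sum(eigenvalues > 1.0)), eigenvalues

# Hypothetical correlation matrix for four measured variables.
R = np.array([
    [1.0, 0.6, 0.1, 0.1],
    [0.6, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.5],
    [0.1, 0.1, 0.5, 1.0],
])
n_retained, eigenvalues = kaiser_count(R)
print(n_retained, np.round(eigenvalues, 3))
```

Returning the full set of eigenvalues, rather than only the count,
lets a reader see how close the first unretained eigenvalue is to
1.0.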

Importantly, these rules do not necessarily lead to the same 
decision regarding the number of factors to retain. For example, 
in a Monte Carlo evaluation, Zwick and Velicer (1986) found that 
the EV > 1 rule almost always severely overestimated the number of 
factors to retain. Their findings were consistent with those of 
Cattell and Jaspers (1967), Linn (1968), Yeomans and Golder 
(1982), and Zwick and Velicer (1982), but contrary to Humphreys 
(1964) and Mote (1970), who noted that the EV > 1 rule may 
underestimate the number of factors. 

Bartlett's chi-square test was very inconsistent. The
statistical significance test is heavily influenced by sample
size. Because EFA studies typically involve large samples, the test
may have little utility.

Despite its subjective nature in interpretation, the scree 
test was much more accurate but also tended to overextract 
factors. Importantly, parallel analysis was the most accurate
procedure, followed closely by the minimum average partial method.
Unfortunately, these methods are seldom employed in published 
research. As an additional option, Thompson (1988) suggested using 
a bootstrap method to determine the number of factors and provided 
a program to automate the process. 
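
Because parallel analysis is seldom used, a brief sketch may help
show how little is involved. The Python code below (NumPy) compares
the eigenvalues of the observed correlation matrix with eigenvalues
from random normal data of the same dimensions. It is an
illustration only: the function name `parallel_analysis`, the number
of simulations, the use of the 95th percentile of the simulated
eigenvalues as the threshold, and the simulated data set are all
assumptions made here, not details taken from the sources cited
above.

```python
import numpy as np

def parallel_analysis(data, n_sims=200, percentile=95, seed=0):
    """Parallel analysis on an (observations x variables) array.

    Retains the leading factors whose observed eigenvalue exceeds the
    chosen percentile of eigenvalues from random normal data of the
    same shape.
    """
    rng = np.random.default_rng(seed)
    n, p = data.shape
    observed = np.linalg.eigvalsh(np.corrcoef(data, rowvar=False))[::-1]
    simulated = np.empty((n_sims, p))
    for i in range(n_sims):
        noise = rng.standard_normal((n, p))
        simulated[i] = np.linalg.eigvalsh(np.corrcoef(noise, rowvar=False))[::-1]
    threshold = np.percentile(simulated, percentile, axis=0)
    exceeds = observed > threshold
    # Count leading factors only; stop at the first one that fails.
    return p if exceeds.all() else int(np.argmin(exceeds))

# Hypothetical data: 300 cases, six variables driven by two factors.
rng = np.random.default_rng(1)
factors = rng.standard_normal((300, 2))
pattern = np.array([[0.8, 0.0], [0.8, 0.0], [0.8, 0.0],
                    [0.0, 0.8], [0.0, 0.8], [0.0, 0.8]])
data = factors @ pattern.T + 0.6 * rng.standard_normal((300, 6))

n_factors = parallel_analysis(data)
print(n_factors)
```

Reporting the observed eigenvalues alongside the random-data
thresholds would give readers the information needed to evaluate the
retention decision independently.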

Because the factor retention decision directly impacts the 
EFA results obtained, researchers are advised to use both multiple 
criteria and reasoned reflection. Researchers should also 
explicitly inform readers about the strategies used in making 
factor retention decisions. 

Factor Rotation and Coefficient Interpretation 

Rotation strategies are numerous and can be classified into
two broad categories: orthogonal and oblique. Almost all 
researchers rotate their EFA results to facilitate interpretation 
of their factors. Discussion of the various rotation strategies is 
dealt with elsewhere (cf. Gorsuch, 1983; Kieffer, 1999; Stevens, 
1996) and will not be addressed here. However, one point will be 
made regarding the coefficients used when interpreting EFA
results.

In EFA, the contribution of a variable to a given factor is
indicated by both factor pattern coefficients (sometimes
ambiguously called "loadings") and factor structure coefficients
(also sometimes ambiguously called "loadings"). Thompson and
Daniel (1996) noted that structure coefficients, or correlations
between observed and latent variables, "are usually essential to
interpretation" (p. 199). Their sentiment applies not only to
factor analysis but also to other general linear model analyses
(cf. Thompson & Borrello, 1985).

In factor analysis, the factor structure matrix gives the 
correlations between all observed variables and all extracted 
(latent) factors. When factors are orthogonally rotated, they 
remain uncorrelated and the factor structure matrix will exactly 
match the factor pattern matrix. Mathematically, the structure
matrix is obtained by multiplying the factor pattern matrix
($P_{v \times f}$) by the factor correlation matrix
($R_{f \times f}$), which is an identity matrix (i.e., ones on
diagonal, zeros off diagonal) after orthogonal rotation. The
resulting structure matrix ($S_{v \times f}$) will match the
original factor pattern matrix (cf. Gorsuch, 1983, p. 52) whenever
the factor correlation matrix is an identity matrix. In such cases,
the pattern matrix should be called the "factor pattern/structure
matrix" to facilitate clarity. Because "loading" is used
ambiguously in the literature, use of this term is proscribed by
some editorial policies (Thompson & Daniel, 1996).

When an oblique rotation is utilized, the factors are allowed 
to correlate with each other. In such cases, the factor 
correlation matrix will not be an identity matrix, and the 
structure matrix will not equal the pattern matrix. Appropriate
interpretation, then, must invoke both the factor pattern and 
factor structure matrices. Because all analyses are correlational 
and belong to the general linear model, the problem of only 
interpreting the factor pattern coefficients is analogous to only 
interpreting beta weights in regression when the predictors are 
correlated. As illustrated by Thompson and Borrello (1985), 
consideration of structure coefficients is critical in such cases. 
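
The relationship between the pattern and structure matrices under
orthogonal and oblique rotation can be sketched numerically. In the
Python code below (NumPy), the pattern coefficients in `P` and the
factor correlation of 0.5 are hypothetical values chosen only for
illustration.

```python
import numpy as np

# Hypothetical pattern coefficients: four variables, two factors.
P = np.array([
    [0.8, 0.0],
    [0.7, 0.1],
    [0.1, 0.7],
    [0.0, 0.8],
])

# Orthogonal rotation: the factor correlation matrix is an identity matrix.
R_orthogonal = np.eye(2)
# Oblique rotation: factors allowed to correlate (0.5 assumed here).
R_oblique = np.array([[1.0, 0.5],
                      [0.5, 1.0]])

S_orthogonal = P @ R_orthogonal  # structure matrix equals the pattern matrix
S_oblique = P @ R_oblique        # structure matrix differs from the pattern matrix

print(np.round(S_orthogonal, 2))
print(np.round(S_oblique, 2))
```

In the oblique case the first variable has a pattern coefficient of
0.0 on the second factor but a structure coefficient of 0.4, so a
reader shown only the pattern matrix would not see that the variable
correlates with that factor.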

EFA Reporting Practices 

Replication is a foundational principle of science (Henson & 
Smith, in press; Thompson, 1999). Findings in a single study
seldom "prove" anything, but confidence in results increases when 
independent researchers externally evaluate the validity of 
previously reported research. Regarding factor analysis, it is 
very important that researchers be able to independently evaluate 
the results obtained in an EFA study. This can, and should, occur
on two levels. First, given the myriad subjective decisions
necessary in EFA, independent researchers should be able to
evaluate the analytic choices of authors in the reported study.
Second,
independent researchers should be able to replicate accurately the 
study on new data, perhaps via a CFA. 

Unfortunately, such practices are not possible for most 
applications of EFA. Tinsley and Tinsley (1987) noted that most 
applied uses of factor analysis do not provide sufficient 
information to allow others to make independent interpretations. 
Too often, authors only report the final results of their factor
analysis, thereby eliminating the possibility of external
evaluation of EFA decisions (cf. Comrey, 1978). Additionally, some 
authors do not report sufficient information to even allow 
independent interpretation of the final results, such as only 
giving part of the factor pattern matrix or excluding the factor 
structure matrix for oblique solutions (Thompson & Daniel, 1996; 
Tinsley & Tinsley, 1979). Many authors have called for more
detailed factor analytic information in published research (cf.
Comrey, 1978; Gorsuch, 1983; Kline, 1994; Thompson & Daniel, 1996;
Tinsley & Tinsley, 1987; Weiss, 1971). According to Hetzel (1996):

It is generally agreed that the following
information should be included when reporting a
factor analysis: (a) background information, such
as sample size, sample composition, method of
selecting the subjects, and method of selecting the
variables; (b) matrix of association used; (c)
method of factor extraction; (d) initial
communality estimates used; (e) the criteria used
for determining the number of factors to retain;
and (f) the method of rotation used. In addition,
the following basic data should be included when
reporting a factor analysis: (a) the means and
variances of the items; (b) the matrix of
associations among the items; (c) the rotated
factor pattern and structure matrices; and (d) the
final communality coefficients, eigenvalues, and
proportion of variance explained by each rotated
factor. (pp. 198-199)

However, in a review of 13 factor analysis articles in the 
Journal of Counseling Psychology, Hetzel (1996) found that much of
this information was not reported by authors. Of course, precious 
journal space may limit information given, but critical decisions 
should nevertheless be explicitly addressed (e.g., the rule(s) 
used to determine the number of factors to retain) and complete 
information should be made available to interested persons (cf. 
Tinsley & Tinsley, 1987).

As noted, the purpose of the present chapter was to examine 
the EFA decisions and reporting practices in published EFA 
research. Although Hetzel (1996) characterized basic patterns of 
reporting in the counseling literature, the present review (a) 
broadened the literature studied to include both measurement and 
substantive articles and (b) considerably expanded the reporting 
practices and decisions examined. 

Method 

Journal and Article Selection 

Journals frequently employing factor analytic studies were
identified from a search of the ERIC and PsycLIT databases using
the keywords "factor analysis". Although many journals publish
articles using EFA, the following four journals were selected for
investigation because of their greater reporting frequency as
regards EFA applications: Educational and Psychological
Measurement (EPM), Journal of Educational Psychology (JEP),
Personality and Individual Differences (P&ID), and Psychological
Assessment (PA). These journals also reflect both substantive and
measurement applications of EFA.

Fifteen uses of EFA were examined from each of the journals, 
resulting in 60 total EFAs studied. Articles were selected if they 
employed EFA; articles only using CFA were not examined. 
Additionally, if one article included more than one EFA, all EFAs
were coded if they were substantively different in terms of the
information reported. We began examining articles from the end of
1999 (except for P&ID, for which only articles from volume 26,
June 1999, and earlier were available) and worked backwards until 
15 applications of EFA from each journal were identified. A total 
of 432 articles were examined. Forty-nine articles were identified 
that used one or more EFAs, giving a total EFA sample of 60. These 
EFAs were coded on multiple criteria to assess the information 
reported and the analytic decisions made by authors. 

Results and Discussion 

Table 1 presents descriptive results for six global EFA 
variables. The sample size distribution was quite variable and 
positively skewed (coefficient of skewness = 3.07). The median 
sample size (267) would be classified as somewhere between fair 
and good according to Comrey and Lee (1992), who portrayed as a 
guide sizes of 50 as very poor, 100 as poor, 200 as fair, 300 as 
good, 500 as very good, and 1000 as excellent. Tabachnick and 
Fidell (1996) noted that, "As a general rule of thumb, it is
comforting to have at least 300 cases for factor analysis" (p.
640).

INSERT TABLE 1 ABOUT HERE 



Stevens (1996) suggested that the number of participants per 
variable is a more appropriate way to determine sample size 
(ranging from five to 20 participants per variable). Fewer
participants are needed when component saturation is high. In the
current sample, the median ratio of number of participants to
variables was 11:1 (coefficient of skewness = 4.67), suggesting
that most sample sizes were marginal to sufficient, depending on
component saturation. However, seven EFAs (11.86%) had ratios less
than Stevens' (1996) minimum of 5:1. One study failed to report 
sample size. 

The extracted factors explained, on average, 52.03% of the 
total variance in the original variables. This amount is 
drastically less than the "75% or more" recommended by Stevens 
(1996, p. 364). It is also inconsistent with Gorsuch's (1983)
claim that "[u]sually, investigators compute the cumulative
percentage of variance extracted after each factor is removed from
the matrix [of association] and then stop the factoring process
when 75, 80 or 85% of the variance is accounted for" (p. 165).
Only the most effective EFAs in the current study met these
criteria for variance accounted for. It is unclear whether the
modest explained variance was due to researchers failing to
extract meaningful factors in their data or to their instruments
failing to yield data with clear internal structure that can be
represented by latent constructs. This question is worthy of
further empirical investigation. If analysts are not extracting
the correct number of factors, subsequent results can be adversely
impacted. If analysts' instruments are not yielding scores with
factorial "simple structure" (Thurstone, 1935), the construct
validity of scores may be questionable.
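
The cumulative-percentage criterion quoted above can be sketched as
follows in Python (NumPy). The eigenvalues are hypothetical values
for a five-variable correlation matrix (they sum to 5.0), and the
function name `factors_for_variance` and the 75% target are
illustrative assumptions.

```python
import numpy as np

def factors_for_variance(eigenvalues, target=0.75):
    """Smallest number of factors whose cumulative variance share meets target.

    Returns that number and the cumulative proportions of variance.
    """
    eigenvalues = np.sort(np.asarray(eigenvalues, dtype=float))[::-1]
    cumulative = np.cumsum(eigenvalues) / eigenvalues.sum()
    # Index of the first cumulative share at or above the target.
    n_factors = int(np.searchsorted(cumulative, target) + 1)
    return n_factors, cumulative

# Hypothetical eigenvalues from a five-variable correlation matrix.
eigenvalues = [2.2, 1.1, 0.8, 0.5, 0.4]
n_factors, cumulative = factors_for_variance(eigenvalues, target=0.75)
print(n_factors, np.round(cumulative, 2))
```

With these hypothetical values, the two factors with eigenvalues
above 1.0 account for 66% of the variance, and a third factor is
needed to reach 75%. The two criteria therefore need not agree,
which is one reason explained variance should be reported alongside
the retention rule used.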

Table 2 presents frequency counts and percentages of articles
reporting various information. Table 2 presents both overall
frequencies as well as those for each journal examined. For the 
sake of brevity, only the overall results will be summarized here. 
However, it should be noted that, in general, the outcomes for the 
individual journals were similar to the overall results, with the 
marked exception of article type. Article type varied considerably 
due to the different objectives, both substantive and measurement, 
of the journals examined. 

INSERT TABLE 2 ABOUT HERE 



Careful examination of Table 2 highlights many of the typical 
decisions made by researchers when conducting EFAs. Careful review
also reveals some egregious errors concerning appropriate 
reporting practice. For example, the majority (65.0%) of authors 
failed to note what matrix of association they analyzed. Authors 
also failed to indicate their factor extraction method on eight 
occasions (13.3%). Among those reporting the extraction method 
used, most (56.7%) used principal components analysis, which is 
the default option in most statistical packages. 

Regarding strategies used to determine the number of factors
to retain, the EV > 1 rule was most common (56.7%). Interestingly,
the EV > 1 rule is also the default in most statistical packages, 
and its frequency of use mirrors that for principal components 
analysis. The scree test was frequently used (35.0%) as well. 
Largely, the other rules were ignored (or at least not reported, 
and so assumed ignored) by authors. For 10 uses of EFA, the 
authors noted that they set the number of factors to extract based 
on a priori theory. As Daniel (1989) and Kieffer (1999) noted, 
this approach is generally not appropriate for EFA, given that EFA 
does not consider the a priori considerations of the researcher in 
the analyses. A CFA would likely be more appropriate in these
circumstances.

Although Zwick and Velicer (1986) demonstrated that minimum 
average partial and parallel analysis were among the most accurate 
methods for determining the number of factors, most authors failed 
to use either of these methods. Minimum average partial was never 
used and parallel analysis (Turner, 1998) was used on four 
occasions (6.7%). Given the problems with the EV > 1 rule, and
less so with the scree test, these findings are troublesome and 
call into question whether the authors extracted the correct 
number of factors from their matrices. It is in cases such as 
these that independent evaluation of results is critical; however, 
few articles reported enough information to allow for such an
investigation.

Furthermore, despite Thompson and Daniel's (1996, p. 200) 
recommendation that "[t]he simultaneous use of multiple decision
rules is appropriate and often desirable," most authors in the 
current study (n = 33, 55.0%) only used (or at least only reported 
using) one rule. Of these, 18 invoked the EV > 1 rule, 6 used the 
scree test, 2 used parallel analysis, and 7 made decisions based 
on a priori theory. Two decision rules were explicitly considered 
in 11 EFAs and three rules were used in 7 EFAs. No authors
reported using more than three rules. Unfortunately, authors of 9 
EFAs (15%) failed to give any indication of how they determined 
the number of factors to extract. 

Regarding factor rotation, orthogonal rotations were most 
common (55.0%), although rotation strategy was not explicitly 
justified in 61.7% of the EFAs. Varimax was the most commonly used 
specific method (51.7%). The most common oblique strategy was 
oblimin (21.7%), although the exact delta value used was not given 
in almost all of these cases (n = 10). Gorsuch (1983) discussed
the potential differences from using varied delta values. When
delta was reported (n = 3), it was always zero, the default in
most statistical packages. 

Thirteen percent of cases did not report their specific 
rotation strategy and three failed to even indicate whether the 
rotation was orthogonal or oblique. Again, this lack of 
information severely limits external evaluation of others' work. 
The reader is left to accept the authors' findings on faith, a 
noble virtue in some contexts, but not in EFA. 

Furthermore, when oblique solutions were used (n = 23), only
5 EFAs included either the factor structure matrix (n = 4) or both
the factor pattern matrix and the structure matrix (n = 1). The
rest erroneously reported only the factor pattern matrix (n
= 11, which is insufficient for the reasons noted previously),
reported none of the matrices (n = 5), or presented the matrix so
ambiguously that the reader could not tell which matrix was given (n
= 2). It can only be assumed that authors used the matrices
reported to make substantive interpretations of the factor 
structure. When only the factor pattern matrix is consulted in 
oblique solutions, incorrect interpretations are very possible 
(Gorsuch, 1983; Kieffer, 1999; Thompson & Daniel, 1996). Structure
coefficients are also almost always necessary for interpretation 
when factors are correlated. 
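
The relation between the two matrices shows why the pattern matrix alone can mislead. When factors are correlated, the structure matrix equals the pattern matrix postmultiplied by the factor correlation matrix. The brief sketch below illustrates this with invented coefficients and an assumed interfactor correlation of .50; none of the values come from the reviewed studies.

```python
import numpy as np

# Hypothetical pattern coefficients: four variables, two oblique factors.
pattern = np.array([
    [0.70, 0.00],
    [0.60, 0.10],
    [0.00, 0.80],
    [0.10, 0.65],
])

# Hypothetical factor correlation matrix (the factors correlate .50).
phi = np.array([
    [1.0, 0.5],
    [0.5, 1.0],
])

# Structure coefficients are correlations between variables and factors.
structure = pattern @ phi

# With uncorrelated factors (identity matrix) the two matrices coincide.
structure_orthogonal = pattern @ np.eye(2)

for name, p_row, s_row in zip(["V1", "V2", "V3", "V4"], pattern, structure):
    print(name, "pattern:", p_row, "structure:", np.round(s_row, 3))
```

In this example the first variable has a pattern coefficient of zero on the second factor yet correlates .35 with it, a relationship that a reader given only the pattern matrix could not see.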

Additional errors of omission included failure to report 
communality coefficients (83.4%), variance explained for each 
factor (63.3%), and eigenvalues for each factor prior to rotation 
(51.7%). We would also suggest that external evaluation would be 
facilitated by reporting the eigenvalue for at least the first 
factor not retained. This eigenvalue would be particularly 
relevant when only the EV > 1 rule is used for extraction. Only 
5.0% of the EFAs reported this information. 
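
To illustrate why the first non-retained eigenvalue is informative, consider the following sketch. The eigenvalues are invented for an eight-variable correlation matrix, and the function name is hypothetical.

```python
def ev_gt_1_summary(eigenvalues, cutoff=1.0):
    """Apply the EV > 1 rule and return what a reader needs to evaluate it.

    Returns (number retained, smallest retained eigenvalue,
    first eigenvalue not retained).
    """
    ordered = sorted(eigenvalues, reverse=True)
    n_retained = sum(ev > cutoff for ev in ordered)
    last_retained = ordered[n_retained - 1] if n_retained > 0 else None
    first_dropped = ordered[n_retained] if n_retained < len(ordered) else None
    return n_retained, last_retained, first_dropped


# Hypothetical eigenvalues (they sum to 8, the number of variables).
eigenvalues = [3.10, 1.80, 1.02, 0.98, 0.45, 0.30, 0.20, 0.15]
n_retained, last_retained, first_dropped = ev_gt_1_summary(eigenvalues)
print(f"Retained {n_retained} factors; "
      f"last retained = {last_retained}, first not retained = {first_dropped}")
```

Here the third and fourth eigenvalues differ by only .04, so the rule retains three factors on the basis of a difference that could easily reflect sampling error. Reporting both values lets readers see how close the decision was.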

Thompson and Daniel (1996) noted that "factors should be 
given names that do not invoke the labels of observed variables 
because the latent constructs are not observed variables 
themselves" (p. 202). Seventy-five percent of the EFAs met this
expectation. Another aspect of factor interpretation involves how
many and how strongly observed variables weight on a given factor.
At least two variables are necessary to define a factor; otherwise
the factor would be little more than the observed variable itself.
Although multiple items were used to define factors in almost all
cases, six of the EFAs involved factors that were defined by only 
one variable, which seems to contradict the basic idea of a factor 
as a latent construct. 

Table 3 presents descriptive information regarding the 
variance explained by extracted factors and the number of salient 
items per factor. The data are presented based on whether the 
authors reported the variance explained before or (appropriately) 
after rotation. The average number of salient items for a given 
factor was around six. 

INSERT TABLE 3 ABOUT HERE 



Finally, we also examined whether CFA was warranted as a 
potentially more appropriate analysis when the authors held a 
priori expectations concerning the factor structure. In general, 
we considered a priori theory tenable when the instrument was not 
new and when the authors had knowledge of the factor structure of
scores from a previous administration of the instrument. In these
cases, CFA is arguably a preferred method given its ability to
falsify theoretical expectations (Thompson & Daniel, 1996). EFA
use is more appropriate during instrument development.

In our sample, CFA was warranted, but not used, in one-third 
of the cases. Some authors (11.7%) conducted an EFA when they had
theoretical expectations but then appropriately followed up the
EFA with a CFA on an independent sample. Although CFA may not be 
tenable in some instances (e.g., small sample size), this finding 
reflects a tendency to underutilize CFA, despite its ability to 
explicitly test hypotheses and build theory. 

In EFA, only one model is tested; in CFA, multiple models can 
be pitted against each other in an attempt to falsify the 
theoretical constructs that are tested. This falsification 
potential is fundamental to construct validity and theory 
development. As noted by Thompson and Daniel (1996), 

...CFA can readily be used to test rival models and 
to quantify the fit of each rival model. Testing 
rival models is usually essential because multiple 
models may fit the same data. Of course, finding 
that a single model fits data well, whereas other 
plausible models do not, does not "prove" the 
model, since untested models may fit even better. 
However, testing multiple plausible models does 
yield stronger evidence regarding validity. (p. 204)

We contend that CFA should be used with greater frequency, 
and if it were, theory development would likely proceed at a 
faster pace. As Long and Brekke (1999) argued: 

Longitudinal factorial invariance has been examined
with both exploratory factor analysis (EFA) and
confirmatory factor analysis (CFA). Of the two
approaches, CFA is preferred because it emphasizes
a priori model testing (Bartko, Carpenter, &
McGlashan, 1988) and avoids the factor selection
and rotation problems of EFA (McDonald, 1985). (p. 498)

Recommendations for Practice 

Based on the results obtained in the present study and the 
pleas of numerous researchers (cf. Comrey, 1978; Gorsuch, 1983;
Kline, 1994; Thompson & Daniel, 1996; Tinsley & Tinsley, 1987;
Weiss, 1971), we offer the following recommendations for
practice when conducting and reporting an EFA. In general, 
sufficient information should be presented to allow external 
evaluation of the analysis and all analytic decisions should be 
explicitly noted. Unfortunately, these expectations were often 
unmet in the articles examined here. 

1. When prior theory exists regarding the structure of the data, 
CFA should probably be used over EFA. 

2. Always report which matrix of association was analyzed and 
the method of factor extraction used. Furthermore, the actual 
matrix of association should be reported (or made available 
upon request) to allow others to replicate the analysis. 

3. Use and report multiple criteria when determining the number
of factors to retain. Avoid overdependence on the EV > 1
rule. Parallel analysis and minimum average partial are
grossly underutilized in published research and should be
employed with greater frequency, given their utility (cf.
Zwick & Velicer, 1986). Thompson and Daniel (1996) provided a
program to conduct parallel analysis for interested readers.
We also suggest that authors report the eigenvalue for the
first factor not retained.

4. Explicitly indicate which specific rotation strategy was used 
(e.g., varimax, promax). Furthermore, explicitly justify why 
an orthogonal or an oblique solution was selected. In 
general, as is the case throughout the general linear model, 
because oblique rotation requires the estimation of more 
parameters, an oblique structure will usually fit sample data 
better than will an orthogonal rotation. However, as Hetzel 
(1996) noted:

Some researchers have argued that, all things being 
equal, orthogonal solutions are desirable. Since 
the factor pattern and the factor structure 
matrices are identical, and the factor correlation 
matrix is an identity matrix, fewer parameter 
matrices are estimated. In theory, the resulting 
parsimony should lead to more replicable results. (p. 194)

5. Always report the full factor pattern/structure matrix. All
factored items should be included. This information is needed
to allow (a) external evaluation of analytic decisions, (b)
others to rotate reported results to alternative rotation
criteria, and (c) meta-analyses of factor structure invariance
across studies. When oblique solutions are used, always report
and interpret both the factor pattern and factor structure
matrices.

Table 4 illustrates a recommended reporting method when 
presenting orthogonal factor pattern/structure coefficients. 
Although this table does not list the justification for the 
rotation strategy or the eigenvalues of non-retained factors, 
it does present all pertinent information concerning results 
from the EFA. While this table alone would not allow an 
external researcher to reproduce a presented study, it should 
help readers understand the general design of the study and 
relevant outcome information concerning the results. 

INSERT TABLE 4 ABOUT HERE 



6. Always report communalities, the total variance explained by
the factors, initial eigenvalues, and the variance explained
by each factor after rotation or final traces (i.e., the
transformed eigenvalue variance-accounted-for statistic after
rotation).

7. Never name a factor with the label used for an observed
variable. Such practice is potentially confusing and does not
honor the fact that the factor is a latent, unobserved
variable. Additionally, do not define a factor with only one
item. Sufficient component saturation is needed to warrant
factor interpretation and to assume some level of
replicability.
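
The quantities named in recommendation 6 follow directly from the rotated coefficients when the solution is orthogonal. As a worked illustration, the sketch below takes the varimax-rotated coefficients printed in Table 4 and computes communalities (row sums of squared coefficients), traces (column sums of squared coefficients), and percent of variance (trace divided by the number of variables, times 100). The variable names in the code are assumptions made for the example. Because the inputs are the three-decimal coefficients as printed, the recomputed values need not agree exactly with every entry in the table.

```python
import numpy as np

# Varimax-rotated pattern/structure coefficients as printed in Table 4
# (rows X1-X10; columns Factor I, Factor II, Factor III).
coefficients = np.array([
    [0.685, 0.133, 0.168],
    [0.005, 0.070, 0.832],
    [0.625, 0.280, 0.032],
    [0.101, 0.688, -0.110],
    [0.035, 0.003, 0.850],
    [0.489, 0.358, 0.252],
    [0.822, 0.085, 0.008],
    [0.006, -0.002, 0.780],
    [0.285, 0.589, 0.056],
    [0.100, 0.785, 0.021],
])
n_variables = coefficients.shape[0]

squared = coefficients ** 2
communalities = squared.sum(axis=1)           # h^2 for each variable
traces = squared.sum(axis=0)                  # post-rotation trace per factor
percent_variance = traces / n_variables * 100

# Coefficients exceeding |.40| are treated as salient, as in the table note.
salient = np.abs(coefficients) > 0.40

for i, h2 in enumerate(communalities, start=1):
    on_factors = np.flatnonzero(salient[i - 1]) + 1
    print(f"X{i}: h2 = {h2:.3f}, salient on factor(s) {on_factors}")
print("Traces:", np.round(traces, 3))
print("% of variance:", np.round(percent_variance, 1))
```

The same arithmetic lets a reader verify any published orthogonal solution, provided the full matrix of coefficients is reported.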

Conclusion

Appropriate use of EFA necessitates thoughtful researcher 
judgment concerning a number of analytic decisions. The present 
paper reviewed some of the fundamental decisions necessary to
conduct an EFA and examined reporting practice in published 
research. Largely, the information presented in published
applications of EFA was insufficient to allow
external verification of the EFA results and researcher decisions.
In addition, the results suggest that researchers often simply use 
the default options in common statistical packages, which may lead 
to errant results. For example, when determining the number of 
factors to retain, the default option (usually EV > 1) is among 
the weakest methods available. 

Multiple deficits in reported information were noted. Several 
recommendations for improved EFA reporting practice were also 
given, including the overall recommendation of providing 
sufficient information to allow external verification of one's EFA 
results and decisions. Historically, factor analytic techniques
have been very useful in theory development and assessing
construct validity (cf. Nunnally, 1978, p. 112). The future value
of EFA would be enhanced by (a) careful consideration of the 
choices made in the analysis and (b) reporting more complete 
information in published research.

Table 1

General Descriptive Results of Exploratory Factor Analysis Reporting Practices

Variable                           n    Median         M        SD      Min.      Max.
Sample size                       59    267.00    436.08    540.74        42      3113
Ratio of no. of participants
  to no. of variables factored    59    11.00a     26.86     52.79      3.25    348.40
No. of variables factored         60     20.00     23.73     16.70         5       110
No. of factors extracted          60      3.00      3.48      1.46         1         7
Cutoff used to determine what
  coefficients were meaningfully
  weighted on a factor            37       .40       .40       .07       .30       .50
Total variance explained by
  extracted factors               43    51.70%    52.03%    14.48%    16.70%    87.50%

Note. n = number of articles reporting the relevant information.
a Indicates that there were 11 participants per one variable factored.

Table 3

Variance Explained and Number of Items for Reported Factors

              % Variance Expl.               Number of Items
Factor         n       M      SD       n       M      SD   Min.   Max.
Reported Before Rotation
  I           17   31.18   11.06      24    7.54    3.41      3     17
  II          16   10.60    5.86      23    5.09    2.70      2     12
  III          9    7.52    3.54      15    4.73    2.34      2      9
  IV           7    7.42    2.01      12    5.67    3.26      2     13
  V            1    5.20      --       4    7.00    1.83      5      9
  VI                                   2   10.50     .71     10     11
  VII                                  1    7.00      --      7      7
Reported After Rotation
  I            4   29.75   30.56       6    7.67    3.67      4     14
  II           4   11.45    1.66       6    6.67    3.78      1     10
  III          3   10.00    1.05       5    5.60    2.88      2      8
  IV           3    8.00     .10       4    5.00    2.94      1      8
  V            1   10.00      --       2    6.50     .71      6      7

Table 4

Heuristic Factor Pattern/Structure Matrix Rotated to the Varimax Criterion

                Factor I    Factor II   Factor III
Variable      Mechanical      Spatial       Verbal       h²
X1                  .685         .133         .168     .515
X2                  .005         .070         .832     .697
X3                  .625         .280         .032     .470
X4                  .101         .688        -.110     .496
X5                  .035         .003         .850     .724
X6                  .489         .358         .252     .431
X7                  .822         .085         .008     .683
X8                  .006        -.002         .780     .608
X9                  .285         .589         .056     .431
X10                 .100         .785         .021     .627

Trace              1.841        1.673        2.132    5.646
% of Variance       18.4         16.7         21.3     56.4

Note. Coefficients greater than |.40| are underlined and retained
for that factor. Percent variance is post-rotation; because here
there were 10 measured variables, "% of Variance" is trace divided
by 10 times 100 (or trace times 10).